Resource Management for Distributed Parallel Systems
Abstract
Multiprocessor systems should exist in the larger context of distributed systems, allowing multiprocessor resources to be shared by those that need them. Unfortunately, typical multiprocessor resource management techniques do not scale to large networks. The Prospero Resource Manager (PRM) is a scalable resource allocation system that supports the allocation of processing resources in large networks and multiprocessor systems. To manage resources in such distributed parallel systems, PRM employs three types of managers: system managers, job managers, and node managers. Multiple independent instances of each type of manager exist, which reduces bottlenecks. The complexity of each manager is further reduced because each is designed to use information at an appropriate level of abstraction.
Similar resources
Static Task Allocation in Distributed Systems Using Parallel Genetic Algorithm
Over the past two decades, PC speeds have increased from a few instructions per second to several million instructions per second. The tremendous speed of today's networks as well as the increasing need for high-performance systems has made researchers interested in parallel and distributed computing. The rapid growth of distributed systems has led to a variety of problems. Task allocation is a...
A new Shuffled Genetic-based Task Scheduling Algorithm in Heterogeneous Distributed Systems
Distributed systems such as Grid and Cloud Computing provision web services to users all over the world. One of the most important concerns that service providers encounter is handling total cost of ownership (TCO). A large part of TCO is related to power consumption due to inefficient resource management. The task scheduling module, as a key component, can have a drastic impact on both user...
The Autopilot Performance-Directed Adaptive
High-performance computing is rapidly expanding to include distributed collections of heterogeneous sequential and parallel systems and irregular applications with complex, data-dependent execution behavior and time-varying resource demands. To provide adaptive resource management for dynamic applications, we are developing the Autopilot toolkit. Autopilot provides a flexible set of performance s...
Just-in-time Transparent Resource Management in Distributed Systems
This paper presents the design and implementation of a resource management system for monitoring computing resources on a network and for dynamically allocating them to concurrently executing jobs. In particular, it is designed to support adaptive parallel computations, that is, computations that benefit from the addition of new machines and can tolerate the removal of machines while executing. The challenge...
A Framework for Adaptive Storage Input/Output on Computational Grids
Emerging computational grids consist of distributed collections of heterogeneous sequential and parallel systems and irregular applications with complex, data-dependent execution behavior and time-varying resource demands. To provide adaptive input/output resource management for these systems, we are developing PPFS II, a portable parallel file system. PPFS II supports rule-based, closed-loop and...